Federated learning (FL) enables distributed model training from local data collected by users. In distributed systems with constrained resources and potentially high dynamics, e.g., mobile edge networks, the efficiency of FL is an important problem. Existing works have separately considered different configurations to make FL more efficient, such as infrequent transmission of model updates, client subsampling, and compression of update vectors. However, an important open problem is how to jointly apply and tune these control knobs in a single FL algorithm, to achieve the best performance by allowing a high degree of freedom in control decisions. In this paper, we address this problem and propose FlexFL - an FL algorithm with multiple options that can be adjusted flexibly. Our FlexFL algorithm allows both arbitrary rates of local computation at clients and arbitrary amounts of communication between clients and the server, making both the computation and communication resource consumption adjustable. We prove a convergence upper bound of this algorithm. Based on this result, we further propose a stochastic optimization formulation and algorithm to determine the control decisions that (approximately) minimize the convergence bound, while conforming to constraints related to resource consumption. The advantage of our approach is also verified using experiments.
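As a rough illustration of two of the control knobs named above (client subsampling and compression of update vectors), the sketch below performs one aggregation step. It is not the FlexFL algorithm: the function names `top_k_sparsify` and `aggregate_round`, the choice of top-k sparsification as the compressor, and uniform client sampling are all assumptions made for illustration.

```python
import random


def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of an update vector, zero the rest."""
    if k >= len(update):
        return list(update)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]


def aggregate_round(global_model, client_updates, num_sampled, k, rng=None):
    """Add the average of compressed updates from a random subset of clients.

    Hypothetical helper: uniform sampling without replacement and plain
    averaging are illustrative assumptions, not taken from the abstract.
    """
    rng = rng or random.Random(0)
    sampled = rng.sample(range(len(client_updates)), num_sampled)
    compressed = [top_k_sparsify(client_updates[c], k) for c in sampled]
    avg = [sum(col) / num_sampled for col in zip(*compressed)]
    return [w + d for w, d in zip(global_model, avg)]
```

Here `num_sampled` and `k` play the role of adjustable communication knobs: lowering either reduces the amount of data sent per round at the cost of a noisier aggregate.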
Federated learning (FL) is a key enabler for efficient communication and computing, leveraging devices' distributed computing capabilities. However, applying FL in practice is challenging due to the local devices' heterogeneous energy, wireless channel conditions, and non-independently and identically distributed (non-IID) data distributions. To cope with these issues, this paper proposes a novel learning framework by integrating FL and width-adjustable slimmable neural networks (SNN). Integrating FL with SNNs is challenging due to time-varying channel conditions and data distributions. In addition, existing multi-width SNN training algorithms are sensitive to the data distributions across devices, which makes SNN ill-suited for FL. Motivated by this, we propose a communication and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges the superposition of multiple-width configurations decoded as many times as possible for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during backpropagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but also deals with non-IID data distributions and poor channel conditions, which is also corroborated by data-intensive simulations.
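Mobile devices are an indispensable source of big data. Federated learning (FL) has great potential for exploiting these private data by exchanging locally trained models instead of the raw data. However, mobile devices are often energy-limited and wirelessly connected, and FL cannot cope flexibly with their heterogeneous and time-varying energy capacities and communication throughputs, which limits its adoption. Motivated by these issues, we propose a novel energy- and communication-efficient FL framework, coined SlimFL. To resolve the heterogeneous energy capacity problem, each device in SlimFL runs a width-adjustable slimmable neural network (SNN). To address the heterogeneous communication throughput problem, each full-width (1.0x) SNN model and its half-width (0.5x) model are superposition-coded before transmission and successively decoded after reception as the 0.5x or 1.0x model, depending on the channel quality. Simulation results show that SlimFL can train the 0.5x and 1.0x models simultaneously with reasonable accuracy and convergence speed, whereas its vanilla FL counterpart trains the two models separately using 2x more communication resources. Surprisingly, SlimFL achieves even higher accuracy with a lower energy footprint than vanilla FL under poor channels and non-IID data distributions, where vanilla FL converges slowly.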
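This paper aims to integrate two synergetic technologies: federated learning (FL) and the width-adjustable slimmable neural network (SNN) architecture. FL preserves data privacy by exchanging the locally trained models of mobile devices. By adopting SNNs as local models, FL can flexibly cope with the time-varying energy capacities of mobile devices. However, combining FL and SNNs is non-trivial, particularly under wireless connections with time-varying channel conditions. Furthermore, existing multi-width SNN training algorithms are sensitive to the data distributions across devices and are therefore ill-suited to FL. Motivated by this, we propose a communication- and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges the superposition of multiple width configurations, of which as many as possible are decoded for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during backpropagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but can also counteract non-IID data distributions and poor channel conditions, which is also corroborated by simulations.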
Code generation models have achieved impressive performance. However, they tend to be brittle as slight edits to a prompt could lead to very different generations; these robustness properties, critical for user experience when deployed in real-life applications, are not well understood. Most existing works on robustness in text or code tasks have focused on classification, while robustness in generation tasks is an uncharted area and to date there is no comprehensive benchmark for robustness in code generation. In this paper, we propose ReCode, a comprehensive robustness evaluation benchmark for code generation models. We customize over 30 transformations specifically for code on docstrings, function and variable names, code syntax, and code format. They are carefully designed to be natural in real-life coding practice, preserve the original semantic meaning, and thus provide multifaceted assessments of a model's robustness performance. With human annotators, we verified that over 90% of the perturbed prompts do not alter the semantic meaning of the original prompt. In addition, we define robustness metrics for code generation models considering the worst-case behavior under each type of perturbation, taking advantage of the fact that executing the generated code can serve as objective evaluation. We demonstrate ReCode on SOTA models using HumanEval, MBPP, as well as function completion tasks derived from them. Interesting observations include: better robustness for CodeGen over InCoder and GPT-J; models are most sensitive to syntax perturbations; more challenging robustness evaluation on MBPP over HumanEval.
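The worst-case idea described above can be sketched as follows: a problem counts as solved only if the generated code passes for every perturbed variant of its prompt. This is a simplified illustration, not the benchmark's exact metric definition; the function name `robust_pass_rate` and the input layout (a mapping from problem id to per-variant pass/fail flags) are assumptions.

```python
def robust_pass_rate(results):
    """Fraction of problems solved under every perturbed variant of the prompt.

    `results` maps a problem id to a list of booleans, one per perturbed
    prompt, each saying whether the generated code passed its unit tests.
    (Hypothetical data layout, chosen for illustration.)
    """
    if not results:
        return 0.0
    # Worst case: one failing variant is enough to count the problem as unsolved.
    solved = sum(1 for outcomes in results.values() if outcomes and all(outcomes))
    return solved / len(results)
```

Because execution of the generated code gives an objective pass/fail signal, the flags can be produced by running each completion against the task's unit tests.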
Recognizing useful named entities plays a vital role in medical information processing and helps drive medical research. Deep learning methods have achieved good results in medical named entity recognition (NER). However, we find that existing methods face great challenges when dealing with nested named entities. In this work, we propose a novel method, referred to as ASAC, to resolve the difficulty caused by nesting; its core idea is to model the dependency between the recognition of different entity categories. The proposed method contains two key modules: the adaptive shared (AS) part and the attentive conditional random field (ACRF) module. The former automatically assigns adaptive weights to each task to achieve optimal recognition accuracy in the multi-layer network. The latter employs an attention operation to model the dependency between different entities. In this way, our model can learn better entity representations by capturing the implicit distinctions and relationships between different categories of entities. Extensive experiments on public datasets verify the effectiveness of our method. We also perform ablation analyses to better understand the method.
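Noncontact particle manipulation (NPM) technology has greatly extended human analytical capability to the micro and nano scales, which in turn has greatly promoted the development of materials science and life science. Although NPM by means of electric, magnetic, and optical fields has achieved great success, from the robotic perspective it remains a labor-intensive operation, since professional human assistance is still mandatory in the early preparation stage. Developing automated noncontact trapping of moving particles is therefore worthwhile, especially for applications in which particle samples are rare, fragile, or contact-sensitive. Taking advantage of the latest dynamic acoustic field modulation technology, and in particular the great scalability of acoustic manipulation from the micro scale to the sub-centimeter scale, we propose in this paper an automated noncontact trapping of moving micro-particles using an ultrasonic phased array system and microscopic vision. To the best of our knowledge, the main contribution of this work is to achieve, for the first time, fully automated trapping of moving micro-particles in the acoustic NPM field by resorting to a robotic approach. In short, the moving state of the particle is observed and predicted by a binocular microscopic vision system, and the acoustic trapping zone is calculated and generated with reference to that prediction. The hand-eye relationship problem of the noncontact robotic end-effector is also solved in this work. Experiments demonstrate the effectiveness of the approach.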
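Existing deraining methods focus mainly on a single input image. With only a single input image, it is extremely difficult to accurately detect rain streaks, remove them, and restore a rain-free image. Compared with a single 2D image, a light field image (LFI) embeds abundant 3D structure and texture information by recording the direction and position of each incident ray via a plenoptic camera, which has become a popular device in the computer vision and graphics research communities. In this paper, we propose a novel network, 4D-MGP-SRRNet, to remove rain streaks from an LFI. Our method takes all sub-views of a rainy LFI as input. To make full use of the LFI, we adopt 4D convolutional layers to build the proposed rain streak removal network so that all sub-views of the LFI are processed simultaneously. Within the proposed network, a rain detection model, MGPDNet, with a novel multi-scale self-guided Gaussian process (MSGP) module is proposed to detect rain streaks in all sub-views of the input LFI. Semi-supervised learning is introduced to detect rain streaks accurately by training on both virtual-world and real-world rainy LFIs at multiple scales, which is made possible by computing pseudo ground truth for the real-world rain streaks. All sub-views, with the predicted rain streaks subtracted, are then fed into a 4D residual model to estimate depth maps. Finally, all sub-views, concatenated with the corresponding rain streaks and with the fog maps converted from the estimated depth maps, are fed into a rainy LFI restoration model based on an adversarial recurrent neural network, which progressively eliminates rain streaks and recovers the rain-free LFI. Extensive quantitative and qualitative evaluations on both synthetic and real-world LFIs demonstrate the effectiveness of the proposed method.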
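Owing to the rapid evolution of the SARS-CoV-2 (COVID-19) virus, many mutations have occurred and given rise to variants such as Alpha, Gamma, Delta, and Omicron, which have had a huge impact on the world economy. Unsupervised machine learning methods are able to compress, characterize, and visualize data. In this paper, we present a framework that uses unsupervised machine learning, combining selected dimensionality reduction and clustering methods, to distinguish the major COVID-19 variants and visualize their associations based on genome sequences. The framework processes genome (RNA) sequences using k-mer analysis and compares dimensionality reduction methods including principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP). In addition, the framework employs agglomerative hierarchical clustering and provides a visualization using dendrograms. We find that the proposed framework can effectively distinguish the major variants and can therefore be used to distinguish emerging variants in the future.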
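The k-mer step mentioned above can be illustrated with a minimal sketch that turns a sequence into a fixed-length frequency vector, which could then be passed to a dimensionality reduction or clustering method. The function names `kmer_counts` and `kmer_vector`, the `ACGT` alphabet, and the normalisation to frequencies are assumptions made for illustration, not details taken from the framework.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a nucleotide sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence, k, alphabet="ACGT"):
    """Return normalised k-mer frequencies in a fixed lexicographic order.

    k-mers containing symbols outside `alphabet` (for example N) are ignored.
    """
    counts = kmer_counts(sequence, k)
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Every sequence maps to a vector of length `len(alphabet) ** k`, so genomes of different lengths become directly comparable.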
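A random dot product graph (RDPG) is a generative model for networks in which vertices correspond to positions in a latent Euclidean space and edge probabilities are determined by the dot products of the latent positions. We consider RDPGs for which the latent positions are randomly sampled from an unknown $1$-dimensional submanifold of the latent space. In principle, restricted inference, i.e., procedures that exploit the structure of the submanifold, should be more effective than unrestricted inference; however, it is not clear how to conduct restricted inference when the submanifold is unknown. We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize the benefit of restricted inference. To illustrate, we test $1$- and $2$-sample hypotheses about the Fr\'{e}chet means of small communities of vertices, using the complete set of vertices to infer latent structure. We propose test statistics that deploy the Isomap procedure for manifold learning, using shortest-path distances on neighborhood graphs constructed from estimated latent positions to estimate arc lengths on the unknown $1$-dimensional submanifold. Unlike conventional applications of Isomap, the estimated latent positions do not lie on the submanifold of interest. We extend existing convergence results for Isomap to this setting and use them to demonstrate that, as the number of auxiliary vertices increases, the power of our test converges to the power of the corresponding test when the submanifold is known. Finally, we apply our methods to an inference problem that arises in studying the connectome of the Drosophila larval mushroom body. The univariate learnt manifold test rejects ($p<0.05$), while the multivariate ambient space test does not ($p\gg0.05$), illustrating the value of identifying and exploiting low-dimensional structure for subsequent inference.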
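The generative model described above can be sketched in a few lines: latent positions are drawn from a one-dimensional curve, and each edge is included with probability equal to the dot product of its endpoints' positions. The particular curve used here, $(t^2, 2t(1-t), (1-t)^2)$, is an assumption chosen for illustration because its points lie on the probability simplex, which keeps every dot product in $[0, 1]$; the names `curve_point` and `sample_rdpg` are likewise hypothetical.

```python
import random


def curve_point(t):
    """Map t in [0, 1] to a point on an illustrative 1-D curve in R^3."""
    return (t * t, 2.0 * t * (1.0 - t), (1.0 - t) * (1.0 - t))


def sample_rdpg(n, seed=0):
    """Sample latent positions on the curve and an undirected adjacency matrix."""
    rng = random.Random(seed)
    latent = [curve_point(rng.random()) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability is the dot product of the two latent positions.
            p = sum(a * b for a, b in zip(latent[i], latent[j]))
            adj[i][j] = adj[j][i] = 1 if rng.random() < p else 0
    return latent, adj
```

In the inference setting of the abstract only `adj` would be observed; the latent positions, and hence the curve, would have to be estimated from it.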